Six phototrophic microbial mat communities from different geothermal springs in Yellowstone National Park (YNP) were
studied using metagenome sequencing and geochemical analyses. The primary goals of 
this work were to determine differences in community composition of high-temperature 
phototrophic mats distributed across the Yellowstone geothermal ecosystem, and to iden- 
tify metabolic attributes of predominant organisms present in these communities that may 
correlate with environmental attributes important in niche differentiation. Random shot- 
gun metagenome sequences from six phototrophic communities (average ~53 Mbp/site) 
were subjected to multiple taxonomic, phylogenetic, and functional analyses. All methods, 
including G + C content distribution, MEGAN analyses, and oligonucleotide frequency- 
based clustering, provided strong support for the dominant community members present 
in each site. Cyanobacteria were only observed in non-sulfidic sites; de novo assemblies
were obtained for Synechococcus-like populations at Chocolate Pots (CP_7) and
Fischerella-like populations at White Creek (WC_6). Chloroflexi-like sequences (esp. Roseiflexus
and/or Chloroflexus spp.) were observed in all six samples and contained genes
involved in bacteriochlorophyll biosynthesis and the 3-hydroxypropionate carbon fixation 
pathway. Other major sequence assemblies were obtained for a Chlorobiales population 
from CP_7 (proposed family Thermochlorobacteriaceae), and an anoxygenic, sulfur-oxidizing 
Thermochromatium-like (Gamma-proteobacteria) population from Bath Lake Vista Annex
(BLVA_20). Additional sequence coverage is necessary to establish more complete assem- 
blies of other novel bacteria in these sites (e.g., Bacteroidetes and Firmicutes); however, 
current assemblies suggested that several of these organisms play important roles in het- 
erotrophic and fermentative metabolisms. Definitive linkages were established between 
several of the dominant phylotypes present in these habitats and important functional 
processes such as photosynthesis, carbon fixation, sulfur oxidation, and fermentation. 

Keywords: microbial mats, microbial interactions, phototrophic bacteria, functional genomics, thermophilic 
bacteria 



INTRODUCTION 

Many naturally occurring microorganisms have eluded isolation, 
due in part to a poor understanding of the chemical, physical, and 
biotic factors defining their realized niches (Rappe and Giovan- 
noni, 2003). Moreover, much of the sequence diversity revealed 
by amplification of specific gene targets (e.g., 16S rRNA) is sus- 
ceptible to biases inherent in primer-design and PCR protocols. 



Random shotgun sequencing of environmental DNA provides a 
direct and potentially less biased view of the composition and 
functional attributes of microbial communities. For example, 
three new chlorophototrophic organisms (i.e., organisms capa- 
ble of (bacterio) chlorophyll-based phototrophy) were discovered 
in prior metagenome analyses of oxygenic mats in YNP, two of 
which lie outside the clades of known phototrophic organisms 
in the Chlorobiales and Chloroflexi (Klatt et al., 2011). Moreover,
the third organism, "Candidatus Chloracidobacterium thermophilum"
("Ca. C. thermophilum"), represents the only known
occurrence of chlorophototrophy in the phylum Acidobacteria
(Bryant et al., 2007; Klatt et al., 2011; Garcia Costas et al., 2012).
Metagenome sequencing and subsequent bioinformatic analyses 
provide an opportunity to identify the metabolic attributes of 
uncultivated organisms that can be used to postulate detailed 
biochemical linkages among individual community members nec- 
essary for the development of computational models describing 
microbial interaction and community function (Taffs et al., 2009). 

High-temperature phototrophic microbial mats have served as 
models for studying microbial community structure and func- 
tion. Studies have included investigations of microbial community 
composition (Miller et al., 2009), the ecophysiology of novel isolates
(Pierson and Castenholz, 1974; Bryant et al., 2007; van der
Meer et al., 2010), comparative genomics, metagenomics, and
metatranscriptomics (Bhaya et al., 2007; Klatt et al., 2007, 2011;
Becraft et al., 2011; Liu et al., 2011, 2012; Melendrez et al., 2011),
community network modeling (Taffs et al., 2009), phage-host 
interactions (Heidelberg et al., 2009), as well as theoretical mod- 
els of evolution (Ward et al., 2008). The high temperature and 
relative geochemical stability of geothermal phototrophic mats in
YNP generally result in communities with several dominant phylo- 
types and have provided opportunities for understanding environ- 
mental factors controlling community composition (Brock, 1978; 
Cohen and Rosenberg, 1989; Ward and Castenholz, 2000; Ward 
et al., 2012). Prior investigations have revealed that temperature, 
pH, and sulfide are among the most important environmental 
variables dictating differences in phototrophic mat community 
structure (Castenholz, 1976, 1977; Castenholz and Pierson, 1995; 
Madigan et al., 2005; Cox et al., 2011; Boyd et al., 2012). The presence
of sulfide was used in the current study to separate anoxygenic
versus oxygenic communities common in YNP (Inskeep et al., 
2013). Oxygenic and/or anoxygenic photoautotrophs are gener- 
ally the predominant primary producers in geothermal mats at 
temperatures of ~50-72°C and moderately acidic to alkaline pH 
(5-9). These mat communities support a diverse array of (photo-) 
heterotrophic, fermentative, sulfate-respiring, and methanogenic 
organisms, whose physiological attributes are critical for under- 
standing community function (Zeikus and Wolfe, 1972; Jackson 
et al., 1973; Henry et al., 1994; Nold and Ward, 1996; Ward et al.,
1998; Taffs et al., 2009; Klatt et al., 2011; Liu et al., 2012).

The distribution of different chlorophototrophic bacteria is 
often controlled by specific geochemical parameters. For exam- 
ple, members of the Cyanobacteria are not generally found in 
acidic or sulfidic environments (Castenholz, 1976, 1977). However, 
filamentous anoxygenic phototrophs (FAPs) of the phylum Chlo- 
roflexi exhibit a wider habitat range than other chlorophototrophs. 
Closely related members of the Chloroflexi [>97% nucleotide 
identity (NT ID) of the 16S rRNA gene] with different pheno- 
types have been cultured from geothermal environments (Madi- 
gan et al., 1974; Madigan and Brock, 1975). FAPs isolated from 
a high-sulfide (>100 µM) spring in the absence of cyanobacteria
(Chloroflexus sp. GCF strains) fixed inorganic carbon using sulfide 
as the electron donor (Giovannoni et al., 1987). However, most 
other cultured Chloroflexus spp. from low-sulfide environments 



are photoheterotrophic and do not utilize reduced sulfur for pho- 
tosynthesis (Madigan et al., 1974; Pierson and Castenholz, 1974). 
Natural populations of FAPs are known to consume organic com- 
pounds produced by cyanobacterial community members (van 
der Meer et al., 2005); however, genomic and biochemical evi- 
dence is needed to improve our understanding of how different 
populations of Chloroflexi function in situ. 

The overall goal of this study was to investigate the underly- 
ing environmental factors and potential physiological adaptations 
important in defining the microbial community structure and 
function of different types of chlorophototrophic mats commonly 
found in association with certain geothermal features of YNP 
(Inskeep et al., 2013). The specific objectives of this study were
to (i) utilize metagenome sequencing and bioinformatic analyses 
to determine the community composition of thermal chloropho- 
totrophic mats in YNP, (ii) identify key metabolic attributes of the 
major chlorophototrophic organisms present in these commu- 
nities, and (iii) evaluate the predominant environmental and/or 
geochemical attributes that contribute to niche differentiation 
of thermophilic chlorophototrophic communities. The habitats 
sampled in the current study were chosen to focus on several 
of the major high-temperature phototrophic mat types that are 
distributed across the YNP geothermal ecosystem. 

RESULTS 

GEOCHEMICAL AND PHYSICAL CONTEXT 

The predominant differences among the six phototrophic micro- 
bial mat communities included both geochemical characteris- 
tics such as pH and dissolved sulfide (DS), as well as temper- 
ature, and the sample depth (Figure 1; Table 1). Temperature 
ranged from 40-60°C across these six sites, and is a critical para- 
meter controlling community composition. Four of the geot- 
hermal sites contained no measurable DS, while both samples 
from Bath Lake Vista Annex Spring (BLVA_5 and BLVA_20) 
were collected from hypoxic sulfidic environments (total DS 
~117 µM). Although the dissolved oxygen content at the source
of Chocolate Pots (near sample location CP_7) was below detection
(<1 µM), this spring contained no sulfide and high concentrations
of Fe(II) (~76 µM) (Table 1), which results in
the precipitation of Fe(III)-oxides upon discharge and reaction
with oxygen (Trouwborst et al., 2007). The phototrophic
mat obtained from White Creek (WC_6) occurs within an oxy- 
genated, alkaline-siliceous geothermal drainage channel contain- 
ing no detectable DS (Table 1). The site was included in the 
study to target a population of the heterocyst-forming cyanobacterium
Fischerella (Mastigocladus) laminosus that has been the
focus of prior work at this location (Miller et al., 2006, 2007, 
2009). 

Samples from Mushroom Spring (MS_15) and Fairy Geyser 
(FG_16) were obtained from laminated phototrophic mats after 
removal of the top layer (see Materials and Methods). Dissection
of these mats was performed to focus on FAPs, which were known
to occur in higher abundance at greater depths below a surface
layer dominated by cyanobacteria (Boomer et al., 2002; Nübel
et al., 2002). The phototrophic mats at FG_16 are referred to as
"splash-mats" because these communities receive frequent
inputs of geothermal water emanating from the main source
FIGURE 1 | Site photographs of phototrophic microbial mats selected
for metagenome sequencing. The sites cover a range in geochemical
conditions including (i) highly sulfidic environments at Bath Lake Vista
Annex (BLVA_5, BLVA_20), (ii) oxygenic phototrophic communities at White
Creek (WC_6) and Chocolate Pots (CP_7), and (iii) subsurface mat layers at
Mushroom Spring (MS_15) and Fairy Geyser (FG_16) (also oxygenic
systems). The anoxygenic phototrophic communities at Bath Lake Vista
Annex (BLVA) were sampled at two different time points (Table S2 in
Inskeep et al., 2013) to compare Chloroflexus mats in the absence
(BLVA_5) and presence (BLVA_20) of purple-bacteria. Arrows indicate
approximate sample locations and types; the inset at BLVA_5 shows mat
dissection at sampling.



Table 1 | Sample locations and aqueous geochemical parameters¹ of six high-temperature phototrophic microbial communities sampled in
Yellowstone National Park (YNP) and used for metagenome sequencing.

| Location | T (°C) | pH | Na⁺ (mM) | Cl⁻ (mM) | SO₄²⁻ (mM) | ¹DIC (mM) | ¹DS (µM) | ¹DO (µM) | ¹DOC (µM) | As (µM) | ²Fe (µM) | ³NH₄⁺ (µM) | Coordinates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bath Lake Vista Annex-Green (BLVA_5) | 57 | 6.2 | 3.9 | 4.4 | 5.6 | 15.8 | 117 | <3 | 104 | 24 | 0.7 | 40 | 44° 57' 54.180" N, 110° 42' 42.228" W |
| Bath Lake Vista Annex-Purple (BLVA_20) | 54 | 6.2 | 5.5 | 5.7 | 7.3 | 24.2 | 117 | <3 | 75 | 23 | 0.7 | 40 | 44° 57' 54.180" N, 110° 42' 42.228" W |
| White Creek (WC_6) | 52 | 8.2 | 3.6 | 1.8 | 0.23 | nd | <3 | 188 | nd | 5 | 1.7 | 1.9 | 44° 31' 53.399" N, 110° 47' 51.799" W |
| Chocolate Pots (CP_7) | 52 | 6.2 | 4.1 | 0.89 | 0.23 | 13.2 | <3 | <3 | 38 | 9 | 75.5 | 4.2 | 44° 42' 36.288" N, 110° 44' 28.824" W |
| Mushroom Spring (MS_15) | 60 | 8.2 | 12.6 | 7.3 | 0.18 | 2.1 | <3 | 141 | nd | 26 | <1 | 4.4 | 44° 32' 19.284" N, 110° 47' 52.692" W |
| Fairy Geyser (FG_16) | 36-38 | 9.1 | 9.4 | 5.2 | 0.18 | 4.8 | <3 | 31 | 30 | 13 | <1 | 1.3 | 44° 32' 31.812" N, 110° 51' 40.788" W |


Correlation (r²)⁴: pH, 0.89*; Cl⁻, 0.93; SO₄²⁻, 0.96*; DS, 0.99***; Fe, 0.72**; NH₄⁺, 0.99**





¹ DS, total dissolved sulfide; DO, dissolved oxygen; DIC, dissolved inorganic carbon; DOC, dissolved organic carbon.

² Mn (total soluble) values were also significant in CP_7 (24 µM) and WC_6 (5 µM), but low in other sites (0.1-0.2 µM, or below detection of 0.1 µM).

³ Nitrate values ranged from 2.1-6.7 µM across sites.

⁴ Correlation significance values: *p < 0.05, **p < 0.01, ***p < 0.001.



pool (85-88°C) (Figure 1). The "splash-mats" surrounding FG_16
are reasonably thick (~3-5 cm), and the sample discussed here
was collected from a 2-4 mm "red-layer," found within a
temperature range of 35-50°C and a pH approaching 9 (Boomer
et al., 2000, 2002). The visual characteristic of the "red-layer" was
apparent during sampling and represents a different subsurface
environment than the sample obtained from MS_15. No measurable
DS was present in the bulk aqueous phase (Table 1) of
these mats; however, subsurface mats in these systems (MS_15
and FG_16) have been shown to be less oxic than their respective
near-surface layers (Jensen et al., 2011).

ANALYSIS OF METAGENOME SEQUENCES 

Individual sequences (average length ~800 bp) were analyzed
using several complementary approaches including alignment- 
based comparisons to reference databases, and evaluation of the 
guanine and cytosine content (% G + C) of each sequence read. 
In addition, comparison of all sequences to the NCBI nr database 
(blastx) was accomplished using MEGAN (Huson et al., 2007). The 
most highly represented phyla across all sites included the Chlo- 
roflexi (28%), Cyanobacteria (12%), Proteobacteria (8%), Bac- 
teroidetes (6%), and Chlorobi (2%). Many sequence reads (27%) 
did not match those available in NCBI ("no hits"); this indicated 
that some members of these communities are not represented in 
current databases. 
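As an illustration of the per-read % G + C evaluation described above, the following is a minimal sketch rather than the pipeline used in the study. The function names, the 5% bin width, and the toy reads are assumptions made for the example; ambiguous bases such as N are simply ignored.

```python
from collections import Counter

def gc_percent(seq: str) -> float:
    """Return %G+C of a nucleotide sequence, ignoring ambiguous bases (e.g. N)."""
    seq = seq.upper()
    gc = sum(seq.count(b) for b in "GC")
    at = sum(seq.count(b) for b in "AT")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total

def gc_histogram(reads, bin_width=5):
    """Count reads per %G+C bin; keys are the lower edge of each bin."""
    hist = Counter()
    for read in reads:
        # Clamp 100% into the top bin so that every bin has the same width.
        lower = min(int(gc_percent(read) // bin_width) * bin_width, 100 - bin_width)
        hist[lower] += 1
    return dict(sorted(hist.items()))

# Hypothetical toy reads (reads in the study averaged ~800 bp).
reads = ["ATATATGC", "GGCCGGCCAT", "GCGCGCGC", "ATNNGC"]
print([round(gc_percent(r), 1) for r in reads])
print(gc_histogram(reads))
```

Plotting such a histogram per site, colored by the taxonomic assignment of each read, would give a frequency profile of the kind summarized in Figure 2.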

Taxonomic assignment of individual sequences was combined 
with %G + C distribution to obtain a profile of community com- 
position (Figure 2). Each site contained populations similar to 
Chloroflexus and/or Roseiflexus spp., with average G + C contents 
of 55 and 61%, respectively. The two sulfidic samples (BLVA_5 
and BLVA_20) showed contributions from both Chloroflexus and 
Roseiflexus-like populations (Figure 2). The oxic community from
White Creek (WC_6) also contained significant contributions from
Chloroflexus-like organisms, while CP_7, MS_15 and FG_16 were
enriched in Roseiflexus-like sequences (Figure 2). All sites contained
a significant number of sequences contributed from novel Chlo- 
roflexi that have not been adequately characterized, and for which 
appropriate reference organisms have not yet been cultivated or 
sequenced. 

The phototrophic mat communities from WC_6 and CP_7 
contained a significant fraction of sequences (23 and 25%, 



respectively) contributed from members of the Cyanobacteria. 
Both sites contained sequences related to Synechococcus spp. 
strains A and B′ (mean G + C content of 60%; Bhaya et al.,
2007) (Figure 2; Figure A1 in Appendix), but the WC_6 community
yielded a large proportion of Cyanobacteria-like sequences
(73%) that could not be classified beyond the phylum-level, and
these sequences exhibit a large range in G + C content (40-65%).
Fischerella laminosus (order Stigonematales) has been shown to
be an important community member at WC_6 (Miller et al.,
2009), and many of the cyanobacterial sequences from WC_6
showed high sequence identity (95% average NT ID of alignments)
to the draft genome of Fischerella sp. JSC-11 (average
G + C = 41%; Figure A2 in Appendix), which was the only 
representative genome available for this group of cyanobacte- 
ria (at time of writing). The G + C content frequency plots 
also revealed major contributions from organisms within the 
Chlorobi (at sites CP_7 and FG_16), Thermotoga (MS_15), and 
Thermochromatium spp. (purple-sulfur bacteria) in BLVA_20
with an average G + C content of 64%. Moreover, all sites con- 
tained bacterial sequences that could not be identified beyond 
the level of Domain Bacteria (especially G + C contents rang- 
ing from 20-40%, Figure 2), in part because appropriate refer- 
ence genomes are not currently available, and significant assem- 
blies were not obtained for phylotypes present in lower abun- 
dance. 

ANALYSIS OF METAGENOME ASSEMBLIES 

The assembly of individual sequence reads into contigs and scaf- 
folds is a powerful method for linking functional attributes with 
specific phylotypes. Assembly yielded scaffolds ranging from 1 kb 
(small contigs) to nearly 126 kb (largest scaffold), and an average 
scaffold size of 2,330 bp across all six sites. Community struc- 
ture plays a role in the degree of assembly and the ability to 
obtain large scaffolds; communities with larger proportions of 
metagenome sequence originating from fewer, more dominant 




FIGURE 2 | Percent G + C content and taxonomic analysis of random
shotgun sequence reads obtained from six thermophilic phototrophic
mat communities from Yellowstone National Park (YNP). The frequency
plot of all sequence reads (black) versus G + C content (%) is shown with
corresponding taxonomic analysis (MEGAN-"blastx") as indicated by the color
key (right). Color key: Roseiflexus, Chloroflexus, Chloroflexi, Cyanobacteria,
Gammaproteobacteria, Bacteroidetes, Chlorobi, Thermotogae, Proteobacteria,
Acidobacteria, Thermodesulfobacteria, Aquificae, Nitrospirae, All.
FIGURE 3 | Principal components analysis of oligonucleotide frequencies
of assembled sequence from Bath Lake Vista Annex. BLVA_20 was
sampled 8 months after BLVA_5 to capture a bloom of purple-sulfur bacteria
shown in prior work to be related to Thermochromatium tepidum (Castenholz,
1977; Ward et al., 1989). Both sites contained scaffolds from dominant
populations of Chloroflexus spp., Firmicutes, Nitrospira, and additional
proteobacteria, but only BLVA_20 contained numerous scaffolds
corresponding to the population of purple-sulfur bacteria
(Gamma-proteobacteria, family Chromatiaceae, average G + C ~64%) that is
notably absent in BLVA_5.



organisms resulted in longer assemblies. Diversity metrics of PCR- 
based 16S rRNA sequences that were produced simultaneously 
from the same samples indicated that subsurface mat communities
from MS_15 and FG_16 exhibited higher Simpson's diversity
values (reported as the reciprocal of the Simpson's index,
Table A1 in Appendix). The greater degree of species "evenness"
in MS_15 and FG_16 yielded considerably smaller assemblies, and
only two scaffolds >10 kb were obtained from each of these two
sites. In contrast, CP_7 exhibited the lowest reciprocal Simpson's index, and
the largest assemblies were obtained from this site, which contributed
42% of the large scaffolds (>10 kb) obtained across all
six sites. Large assemblies were also obtained from the anoxygenic
mats at BLVA (BLVA_5, BLVA_20), and these samples had similarly low
values of the reciprocal Simpson's index.
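To make the link between evenness and the reciprocal Simpson's index concrete, the sketch below computes 1/Σp_i² from per-phylotype counts. The exact estimator used for Table A1 is not stated here, so the simple proportion-based form is an assumption, and the two count vectors are hypothetical.

```python
def reciprocal_simpson(counts):
    """Reciprocal Simpson index, 1 / sum(p_i^2), from per-phylotype counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    simpson = sum((c / total) ** 2 for c in counts)
    return 1.0 / simpson

# Hypothetical clone-library counts for four phylotypes.
even_community = [25, 25, 25, 25]    # high evenness
skewed_community = [85, 5, 5, 5]     # one dominant phylotype

print(reciprocal_simpson(even_community))
print(reciprocal_simpson(skewed_community))
```

The skewed community returns the lower value. That is the situation in which more of the sequence originates from a few dominant populations, which is consistent with the larger assemblies reported for CP_7 and BLVA.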

NUCLEOTIDE WORD-FREQUENCY ANALYSIS OF DOMINANT 
POPULATIONS 

Sequence assemblies were examined using principal components 
analysis (PCA) of nucleotide word frequencies (NWF) (Teeling 
et al., 2004) in conjunction with a taxonomic classification algo- 
rithm of average scaffold identity (APIS; Badger et al., 2006). For 
example, NWF PCA plots of the sulfidic system at BLVA sampled 
8 months apart revealed major differences in community compo- 
sition associated with a visible bloom of purple-sulfur bacteria in 
BLVA_20 (Figures 1 and 3). The major change in community com- 
position between the two samples was the Thermochromatium-like 
population in BLVA_20, which corresponded with a decrease in 
Roseiflexus-like sequences (Figure 3). Both BLVA samples revealed
a dominant Chloroflexus-like population that corresponded to
the G + C peak at 55% (Figure 2). Similar NWF PCA analyses 
of assemblies from CP_7 revealed three predominant community
members related to Roseiflexus, Synechococcus, and "Candidatus
Thermochlorobacter aerophilum"-like organisms ("Ca. T.
aerophilum" represents a novel clade in the order Chlorobiales; Liu
et al., 2012). Several other organisms were present in lower abundance
and were distantly related to members of the Firmicutes,
Bacteroidetes, and Spirochaetes (Figure A3 in Appendix). The large
Chlorobi-like assemblies obtained from CP_7 were phylogenetically
related (average NT ID = 91%) to "Ca. T. aerophilum" assemblies
obtained from Mushroom and Octopus Springs metagenomes
(Klatt et al., 2011; Liu et al., 2012). Translated PscD sequences
from this newly described lineage of uncultivated Chlorobi are
clearly distinct from other previously described phototrophic
Chlorobi; PscD sequences from the CP_7 and Mushroom populations
have 95% amino acid identity (AA ID) (Figure A4 in
Appendix).
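The nucleotide word frequencies that serve as input to the PCA can be sketched as follows. This is a simplified illustration, not the Teeling et al. (2004) implementation: it uses plain tetranucleotide frequencies normalized to the number of counted windows, and the function names and toy sequences are assumptions.

```python
from itertools import product
import math

def word_frequencies(seq, k=4):
    """Frequency of every k-mer (overlapping windows), normalized to the window count."""
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    n = 0
    for i in range(len(seq) - k + 1):
        w = seq[i:i + k]
        if w in counts:  # windows containing ambiguous bases are skipped
            counts[w] += 1
            n += 1
    if n == 0:
        raise ValueError("sequence is shorter than k or fully ambiguous")
    return [counts[w] / n for w in words]

def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two hypothetical sequences with identical G + C content (50%).
seq_a = "ACGT" * 10
seq_b = "AACCGGTT" * 5
freq_a, freq_b = word_frequencies(seq_a), word_frequencies(seq_b)
print(len(freq_a), euclidean(freq_a, freq_b) > 0)
```

Both toy sequences have the same G + C content, yet their word-frequency vectors differ. This is the property that lets word-frequency methods separate populations that overlap in a G + C histogram.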

A Monte-Carlo approach was also used to compare normal- 
ized oligonucleotide frequencies across the six phototrophic sites, 
which clustered the scaffolds of highly related organisms (e.g., 
genus/species level). A minimum scaffold length of 10 kbp was
used to focus the analysis on dominant assemblies; consequently, 
smaller scaffolds from subsurface mat communities (MS_15 and 
FG_16) were not well represented in this analysis. Twelve scaf- 
fold clusters (consensus k-means groupings) were observed across 
sites (Figure 4; Table 2), and each of these populations corre- 
sponded with dominant community members identified using 
G + C content (%) and BLASTP assignments (Figure 2; Figure A5 
in Appendix). Clustering by oligonucleotide frequency afforded 
greater discrimination among populations that exhibited similar
G + C content. For example, Roseiflexus-like organisms have
similar G + C content (61%) to Synechococcus sp. strains A and B′
(Figure 2), yet these different genera are clearly separated based on 
differences in sequence character using oligonucleotide clustering 
analysis (Figure 4). 
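The consensus grouping behind this analysis (repeated k-means trials, retaining scaffold pairs that co-cluster in most trials) can be illustrated with a small pure-Python sketch. It is not the code used in the study: the two-dimensional toy vectors stand in for normalized oligonucleotide count vectors, k = 2 replaces the k = 8 reported for Figure 4, and all function names are hypothetical.

```python
import random
from itertools import combinations

def kmeans(points, k, rng, iters=25):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def coclustering_fraction(points, k, trials=100, seed=0):
    """Fraction of k-means trials in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    together = {pair: 0 for pair in combinations(range(len(points)), 2)}
    for _ in range(trials):
        labels = kmeans(points, k, rng)
        for i, j in together:
            together[(i, j)] += labels[i] == labels[j]
    return {pair: n / trials for pair, n in together.items()}

# Hypothetical frequency vectors for six scaffolds from two populations.
vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
fractions = coclustering_fraction(vectors, k=2)
edges = sorted(pair for pair, f in fractions.items() if f > 0.9)
print(edges)
```

Pairs that pass the 0.9 threshold correspond to the lines drawn between scaffolds in a similarity network such as Figure 4. Repeating the clustering from many random starts makes the retained groupings less sensitive to any single initialization.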

A sequence cluster corresponding to Thermochromatium 
spp. (Gamma-proteobacteria) contained sequences solely from 



www.frontiersin.org 



June 2013 | Volume 4 | Article 106 | 5 



Klatt et al.



High-temperature chlorophototrophic microbial mats 



FIGURE 4 | Scaffold oligonucleotide frequency similarity network.
Oligonucleotide (tri-, tetra-, penta-, and hexa-nucleotide) counts were
normalized to scaffold length and subjected to k-means clustering (k = 8, 100
trials). The scaffolds that group together in >90% trials are shown, with lines
connecting scaffolds ranging from blue (90%) to red (100%). The sample
origins of scaffolds shown here are indicated by site color (see legend) where
open circles correspond to reference genomes; scaffolds containing
phylogenetic or functional marker genes are indicated by larger nodes.
Legend: BLVA Green (BLVA_5); BLVA Purple (BLVA_20); Chocolate Pots (CP_7);
scaffold with marker gene.



Table 2 | Properties of scaffold clusters obtained from metagenome assemblies as demarcated with oligonucleotide composition and
confirmed using phylogenetic analyses.

| Scaffold cluster | Taxonomic affiliation | Sites | No. of scaffolds | Median size (kbase) | G+C (%) ± SD | Total sequence (Mbase) | Depth of coverage (×) |
|---|---|---|---|---|---|---|---|
| 1 | Roseiflexus spp. | BLVA_5, CP_7, MS_15, FG_16 | 112 | 12.5 | 60.0 ± 1.2 | 1.55 | 2.6 ± 0.4 |
| 2 | Chloroflexus spp. | BLVA_5, BLVA_20, WC_6, CP_7 | 211 | 13.5 | 54.3 ± 1.2 | 3.21 | 2.9 ± 0.7 |
| 3 | Ca. Thermochlorobacter spp. | CP_7 | 73 | 14.8 | 49.5 ± 0.8 | 1.13 | 2.7 ± 0.5 |
| 4 | Thermochromatium spp. | BLVA_20 | 29 | 12.5 | 63.0 ± 1.3 | 0.37 | 2.1 ± 0.4 |
| 5 | Synechococcus spp. | WC_6, CP_7 | 78 | 26.2 | 58.7 ± 1.1 | 2.59 | 4.0 ± 0.7 |
| 6 | Cyanobacteria | WC_6, CP_7 | 26 | 11.7 | 49.8 ± 1.2 | 0.32 | 2.4 ± 0.5 |
| 7 | Bacteroidetes | WC_6 | 30 | 11.1 | 37.7 ± 0.9 | 0.37 | 2.4 ± 0.4 |
| 8 | Chloroflexi-like | BLVA_5, MS_15, BLVA_20 | 37 | 10.6 | 63.9 ± 2.3 | 0.44 | 2.5 ± 0.5 |
| 9 | Firmicute-like | BLVA_5, CP_7, BLVA_20 | 47 | 14.2 | 36.0 ± 1.5 | 0.79 | 2.7 ± 0.4 |
| 10 | Firmicute-like | BLVA_5, BLVA_20 | 11 | 12.7 | 29.0 ± 1.4 | 0.16 | 2.6 ± 0.6 |
| 11 | Spirochaetes | CP_7 | 21 | 11.8 | 30.5 ± 1.4 | 0.25 | 2.3 ± 0.4 |



Scaffold clusters 7-11 represent novel bacteria that are not well represented in public databases, and are currently defined at the phylum level. 
BLVA_20, which is consistent with visual evidence of this population
at the time of sampling (Figure 1), as well as further
NWF PCA analysis using contigs >20 kb (Figure A6 in Appendix).
Other major sequence clusters identified included the "Ca.
T. aerophilum"-like population from CP_7 (discussed above).
Although relatives of the Bacteroidetes were found to occupy all 
sites, larger assemblies of several of these community members 
were obtained from WC_6. Three scaffold clusters with com- 
paratively low G + C content (<40%) were observed, but nei- 
ther AMPHORA (based on phylogenetic analysis) nor MEGAN 
("blastx" alignments) could classify the sequences in these groups. 
This suggested that they originated from organisms that are 
currently poorly represented in public databases. 

USE OF SINGLE-COPY GENES TO DEMARCATE DOMINANT 
POPULATIONS 

Phylogenetically informative single-copy genes were identified 
among the metagenome assemblies using AMPHORA (Wu and 



Eisen, 2008), and provided yet another method for evaluat- 
ing the predominant taxa represented in the six metagenomes. 
The distribution of dominant phylotypes predicted using 
AMPHORA (Figure 5A) was similar to that observed using 
the combined "blastx" and G + C (%) analyses of individual 
sequences (Figure 2), as well as to the taxonomic distribu- 
tion of PCR-based 16S rRNA gene libraries from these same 
sites (Figure 5B). Moreover, the distribution of predominant 
populations (e.g., Chloroflexi, Cyanobacteria, Chlorobi, Proteobacteria)
across sites was consistent with detailed analysis
of major oligonucleotide clusters (e.g., Figures 3 and
4). All approaches showed that members of the Chloroflexi
were ubiquitous across all sites. The relative contribution
of Chloroflexus- versus Roseiflexus-like organisms varied
across different sites, and all sites contained novel organisms
from undescribed lineages within the Chloroflexi (discussed
in greater detail below). Other phototrophs detected in
these sites included populations of Alpha-proteobacteria (Family 
FIGURE 5 | Phylogenetic summary of marker genes from metagenome
sequences compared to 16S rRNA gene sequences. Phylogenetic marker
genes in the metagenome sequences were (A) assigned and classified using
AMPHORA, and compared to (B) 16S rRNA sequences from ribosomal panels
(n ~ 300 per site) classified at the phylum-level against the RDP at a
confidence threshold of 80%.
Hyphomicrobiaceae) in FG_16, "Ca. C. thermophilum" (phylum
Acidobacteria) (Bryant et al., 2007) in WC_6, and "Ca. T.
aerophilum"-like organisms (order Chlorobiales) in MS_15,
FG_16 and especially CP_7 (Figure 5B). The MS_15 community
contained a Thermotoga-like population as well as several low
G + C organisms that have not yet been characterized. Although 
the subsurface mat community from FG_16 contained a novel 
high G + C proteobacterial population not seen in the other sites 
(Figure 2), these sequences could not be linked unambiguously to 
the Hyphomicrobiaceae 16S rRNA sequences described above, due 
to inadequate sequence coverage of this population and the lack 
of a good reference genome that would undoubtedly have assisted 
in sequence identification. 

The distribution of phylogenetically unique Chloroflexi-like 
16S rRNA gene sequences across sites was compared to the abun- 
dance of Chloroflexi marker genes in the metagenome assem- 
blies identified using AMPHORA (Figure 6). The majority of 
Chloroflexi-like 16S rRNA sequences were most similar to either 
Chloroflexus or Roseiflexus spp.; however, many sequences fell 
outside of the family Chloroflexaceae and grouped with other 
members of the Chloroflexi that are not known to exhibit phototrophy
(Figure 6). Additionally, Roseiflexus-like populations
from MS_15, CP_7, and FG_16 and Chloroflexus-like populations
from BLVA and WC_6 each formed monophyletic groups that
excluded sequences from all other springs (Figure A7 in Appendix).
Other spring-specific clades were observed for sequences
from FG_16 within the class Anaerolineae, a group of Chloroflexi
that was very recently shown to contain phototrophic members
(Klatt et al., 2011). The presence of these 16S rRNA gene
sequences, combined with observed Chloroflexi-like photosyn- 
thesis genes associated with these populations, suggests that these 
undescribed Chloroflexi may also contribute to phototrophy in 
these mat communities. 

FUNCTIONAL ANALYSIS OF PREDOMINANT SEQUENCE ASSEMBLIES 

Carbon fixation 

The gene content of major scaffold clusters provides a basis for 
inferring the possible metabolic functions of dominant popu- 
lations present in these communities (Table 3). For example, 
genes encoding key enzymes involved in the 3-hydroxypropionate 
(3-HP) pathway of inorganic carbon fixation were present 
in the metagenomes from all six sites, and were associated 
with the predominant Chloroflexus and Roseiflexus-like popu- 
lations present in these habitats. Genes coding for subunits of 
ribulose 1,5-bisphosphate carboxylase-oxygenase (RuBisCO), a 
key enzyme in the reductive pentose phosphate pathway (i.e., 
Calvin-Benson-Bassham cycle), were observed only in cyanobacterial
(WC_6 and CP_7) or proteobacterial sequences (alpha-proteobacteria
and Thermochromatium spp. in FG_16 and
BLVA_20, respectively). No CO2 fixation genes were associated
with the sequences derived from the "Ca. T. aerophilum"-like populations
from CP_7, although other cultivated members
of this phylum are capable of fixing CO2 via the reductive tricarboxylic
acid (rTCA) cycle. The average coverage of "Ca. T.
aerophilum" assemblies (~3×) may not be sufficient to conclude
that these Chlorobi definitively lack the capacity to fix inorganic
carbon; however, metatranscriptomic studies with much deeper
coverage also failed to identify key genes (i.e., ATP-citrate lyase)
of the rTCA cycle in these populations at Mushroom Spring (Liu
et al., 2012). This organism is a member of a novel, family-level
lineage of the Chlorobi, whose members are predicted to be aerobic
photoheterotrophs that cannot oxidize sulfur compounds, cannot fix
N2, and do not fix CO2 autotrophically (Liu et al., 2012).
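The caution about ~3× coverage can be made quantitative with a rough calculation. The sketch below assumes idealized, uniformly random sampling of a genome (the Poisson approximation commonly associated with Lander-Waterman statistics), which is an assumption introduced here and not part of the study's analysis. Under that assumption, the expected fraction of bases with zero read depth at mean coverage c is e^(-c), or roughly 5% at 3×.

```python
import math

def fraction_unsampled(coverage):
    """Expected fraction of a genome with zero read depth at a given mean coverage.

    Assumes reads are placed uniformly at random (Poisson approximation).
    """
    return math.exp(-coverage)

# Coverage values spanning the range reported for the scaffold clusters in Table 2.
for c in (2.1, 3.0, 4.0):
    print(f"{c:.1f}x coverage: ~{100 * fraction_unsampled(c):.1f}% of bases unsampled")
```

Real metagenome coverage is uneven, and assembly and gene calling lose additional sequence, so the true chance of missing a gene is likely higher than this idealized figure. This is why the absence of a gene at this depth is treated as suggestive rather than conclusive.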

Chlorophototrophy 

Genes involved in (bacterio)chlorophyll biosynthesis and the
production of photosynthetic reaction centers (here termed
chlorophototrophy genes) were present in scaffold clusters
corresponding to Roseiflexus, Chloroflexus, Thermochromatium, and
Synechococcus spp., as well as the "Ca. T. aerophilum"-like
population in CP_7, and other Cyanobacteria, especially in WC_6
(Table 3). Consequently, the dominant phototrophs within each
community exhibit genomic capability for chlorophototrophic
metabolism. Examination of shorter (<10 kbp) scaffolds revealed
additional genes involved in chlorophototrophy, and these were
assigned to specific chlorophototrophic organisms such as "Ca.
Chloracidobacterium spp." present in WC_6, and uncultivated
proteobacteria in the FG_16 subsurface mat community (Table 3).
The high G + C% proteobacterial sequences from FG_16 averaged 74%
identity (AA) to Rhodopseudomonas palustris and other
FIGURE 6 | Comparison of Chloroflexi phylogenetic marker genes from
metagenomes and Chloroflexi 16S rRNA clones. Phylogenetic marker 
genes within the metagenome sequences assigned to the phylum 
Chloroflexi using AMPHORA (A) compared to the identity (confidence 
threshold of 80%) of Chloroflexi-like 16S rRNA genes (B) observed in the 
ribosomal clone library (n ~ 300 per site). Taxonomic groups of Chloroflexi: 
red = Roseiflexus spp., green = Chloroflexus spp., brown shades = other 
taxa within the order Chloroflexales, and yellow shades = other taxa within 
phylum Chloroflexi. 
Table 3 | Phylogenetic distribution of autotrophic, phototrophic, and sulfur cycling genes in metagenomes.

Phylogenetic Group            BLVA Green  BLVA Purple  White Creek  Chocolate Pots  Mushroom Spring  Fairy Geyser
                              (BLVA_5)    (BLVA_20)    (WC_6)       (CP_7)          (MS_15)          (FG_16)

CARBON FIXATION PATHWAYS
Roseiflexus spp.              0.60        0.80         0.20         0.90            0.60             0.80
Chloroflexus spp.             1.00        1.00         0.80         -               -                -
Other Chloroflexi             -           -            -            0.20            -                0.50
Cyanobacteria                 -           -            0.20         0.60            -                -
Thermochromatium spp.         -           0.67         -            -               -                -
Alpha-proteobacteria          -           -            -            -               -                0.67

Roseiflexus spp.              0.76        0.57         0.14         0.86            0.57             0.76
Chloroflexus spp.             1.00        0.90         0.76         -               0.19             0.05
Other Chloroflexi             -           -            0.14         0.14            0.14             0.43
Thermochromatium spp.         -           0.13         -            -               -                -
Cyanobacteria                 -           -            0.45         1.36            0.27             0.09
Ca. Thermochlorobacter spp.   -           -            0.08         0.83            0.08             0.25
Ca. Chloracidobacterium spp.  -           -            0.23         -               0.05             -
Alpha-proteobacteria          -           -            -            -               -                0.25

Roseiflexus spp.              0.50        0.25         -            0.50            -                0.50
Chloroflexus spp.             0.40        0.40         0.20         -               -                -
Cyanobacteria                 -           -            0.33         0.97            0.52             1.00
Thermochromatium spp.         -           0.80         -            -               -                -

Entries represent relative completeness of indicated pathways calculated as the fraction of a unique occurrence of a gene in a taxon divided by the total number of genes
known to be involved in that function (values > 0.5 are in bold). Metagenome sequences were compared to known pathways in the genome sequences of Chloroflexus
aurantiacus J-10-fl, Roseiflexus sp. strain RS-1, Thermochromatium spp., Allochromatium vinosum, Synechococcus sp. strain A, "Candidatus Chloracidobacterium
thermophilum", Chloroherpeton thalassium, and the alpha-proteobacterium, Rhodopseudomonas palustris TIE-1.
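
The completeness fraction defined in the Table 3 footnote can be illustrated with a minimal Python sketch. The function name, the five-gene pathway, and the detected gene list are hypothetical placeholders chosen for illustration; they are not the query gene sets used in this study.

```python
def pathway_completeness(detected_genes, pathway_genes):
    """Fraction of the pathway's genes found at least once for a taxon."""
    pathway = set(pathway_genes)
    if not pathway:
        raise ValueError("pathway_genes must not be empty")
    # Each pathway gene is counted once, however many copies were detected.
    found = set(detected_genes) & pathway
    return len(found) / len(pathway)


# Hypothetical example: a 5-gene pathway with 3 distinct genes detected.
pathway = ["geneA", "geneB", "geneC", "geneD", "geneE"]
detected = ["geneA", "geneC", "geneC", "geneE", "unrelated"]
print(f"{pathway_completeness(detected, pathway):.2f}")  # 0.60
```

Duplicate hits and genes outside the pathway do not change the score in this sketch, which is one plausible reading of "unique occurrence" in the footnote.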



alpha-proteobacterial genomes, and are likely contributed by the
Hyphomicrobiaceae population in FG_16. Genes from Chloroflexi coding
for chlorophototrophic functions, but too divergent to originate from
either Chloroflexus or Roseiflexus spp. (i.e., only ~70% AA ID), were
present in all non-sulfidic sites, especially in FG_16 (Table 3). The
Chloroflexi-like chlorophototrophy genes from FG_16 are
phylogenetically distinct (<70% AA ID) from previously described
metagenome sequences and all related sequences residing in public
databases, indicating that novel uncultured phototrophic members of
the Chloroflexi inhabit the mats at Fairy Geyser. Three deduced
protein sequences from the subsurface layer in Mushroom Spring
(MS_15) were highly similar (96-100% AA ID) to translated sequences
of novel chlorophototrophy genes observed in recent "meta-omic"
studies of the top layers of this same mat type (Klatt et al., 2011;
Liu et al., 2011); these observations linked these genes to a group
within the Chloroflexi not previously known to contain
chlorophototrophic organisms.

Iron oxidation 

One goal of this study was to investigate the role of anoxygenic
photosynthesis in sulfidic communities from Bath Lake Vista Annex and
in iron mats at Chocolate Pots. Previous studies near the source of
Chocolate Pots (and near CP_7) have shown that the oxidation of
aqueous Fe(II) is abiotic, but mediated by the production of oxygen
by cyanobacteria (Pierson et al., 1999; Trouwborst et al., 2007).
However, voltammetric microelectrode studies revealed that Fe(II)
persists in deeper layers of the mat, providing a potential niche for
anoxygenic phototrophs that can use Fe(II) as an electron donor for
photosynthesis (photoferrotrophy) (Trouwborst et al., 2007). Query
genes for both sulfur and Fe(II) oxidation (Croal et al., 2007; Jiao
and Newman, 2007; Frigaard and Dahl, 2009; Grimm et al., 2011; Bryant
et al., 2012) were used to search for evidence of sulfide or Fe(II)
oxidation in the community from CP_7. No genes with significant
similarity to the photosynthetic iron oxidation (pio) operon of the
purple non-sulfur Rhodopseudomonas palustris TIE-1 (Jiao and Newman,
2007) or the fox operon of the purple non-sulfur Rhodobacter
ferrooxidans SW2 (Croal et al., 2007) were observed in CP_7, or any
site described in this study, with the exception of one sequence in
FG_16, a site in which iron is below detectable levels (Table 1).
This result concurs with the low numbers of alpha-proteobacterial
sequences in CP_7 (Table 3), and the lack of Fe(II) oxidation when
similar mats were illuminated with near-infrared radiation to excite
bacteriochlorophylls (Trouwborst et al., 2007). To date, no
thermophilic representatives of purple and green photoferrotrophs
have been discovered.

Sulfur oxidation 

Genes known to encode proteins involved in sulfur oxidation (dsr
complex) in some anoxygenic phototrophs (e.g., gammaproteobacterium
Allochromatium vinosum, Dahl et al., 2005; Frigaard and Dahl, 2009;
Gregersen et al., 2011) were identified in the Thermochromatium-like
population from BLVA_20, and this is consistent with the high
concentrations of DS (>100 µM) measured in situ. However, the
dominant Chloroflexus-like populations observed in both BLVA samples
do not contain dsr or sox genes known to be involved in the oxidation
of reduced-sulfur compounds. This is consistent with the absence of
these same genes in reference Chloroflexus and Roseiflexus spp.
genomes (van der Meer et al., 2010; Tang et al., 2011). However, the
Chloroflexus assemblies from BLVA_20 and Roseiflexus assemblies of
CP_7 (as well as FAP reference genomes) contain sqr genes, which
encode sulfide-quinone oxidoreductases and have been suggested to
play a role in the oxidation of sulfide to elemental sulfur in
multiple bacterial phyla (Griesbeck et al., 2002; Chan et al., 2009;
Marcia et al., 2009). Consequently, it is possible that proteins
encoded by sqr genes may enable FAPs to obtain electrons from
reduced-sulfur compounds (Frigaard and Dahl, 2009; Gregersen et al.,
2011; Bryant et al., 2012). In the current study, the presence of
similar Chloroflexus as well as similar Roseiflexus populations
across both sulfidic and non-sulfidic sites argues that utilization
of sulfide as an electron source is not an obligate physiological
trait across these genera.

Anaerobic metabolism 

Sequence clusters corresponding to undescribed organisms from the
Bacteroidetes show no evidence of chlorophototrophy, but rather
contain genes suggestive of anaerobic metabolism(s). Protein-coding
genes involved in the oxidation and/or fermentation of organic acids
were noted in several sites. For example, acyl-CoA synthetases and
lactate dehydrogenases were found in unidentified clusters from BLVA
(G + C = 64%) and CP_7 (G + C = 31%) and a mixed cluster containing
sequences from BLVA and CP (G + C = 36%). Subunits of a
pyruvate:ferredoxin oxidoreductase (PFOR) were found in both
unidentified BLVA clusters. Although anaerobic populations are
important in every mat type, insufficient coverage of these
less-dominant populations in chlorophototrophic mats precludes a
thorough analysis of their metabolic potential.

COMPARATIVE ANALYSIS OF PROTEIN FAMILIES 

A complete functional analysis was performed (using multivariate
statistical analysis) by assigning TIGRFAM protein families to
predicted proteins within all metagenome assemblies. Differences in
gene contents among the six chlorophototrophic mats should be
indicative of changes in community structure and the corresponding
functional attributes of dominant community members. Principal
components analysis (PCA) was used to examine the relative
differences among sites based on all TIGRFAM categories (Figure 7).
Factor 1 (PC1, accounting for ~41% of the relative functional
variation across sites) separates subsurface from surface mat
communities, while PC2 (~27% of variation) separates the sites
according to different levels of oxygen (or sulfide) and the presence
of oxygenic phototrophs. Factor 3 (PC3, ~17% of variation) emphasizes
functional similarities between MS_15 and WC_6 that are difficult to
separate based only on an examination of the abundance of different
phylotypes across these sites (e.g., Figure 2). For example, although
both sites contained cyanobacteria (e.g., low sulfide), MS_15
contained more sequences related to Roseiflexus spp., while WC_6
contained numerous Chloroflexus-like sequences. These populations may
be organotrophic in this environment and not dependent on sulfide or
elemental sulfur (Table 1; Figure 6).
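
The idea behind the ordination in Figure 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abundance values and the three unnamed categories are invented, the helper names are hypothetical, and the first component is found by simple power iteration rather than by the statistical software used for the published analysis.

```python
import math


def center_columns(rows):
    """Subtract each column (functional category) mean from the data."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of column-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=500):
    """Leading eigenvector and eigenvalue of cov by power iteration."""
    p = len(cov)
    # Non-uniform start vector: a uniform one lies in the null space of the
    # covariance matrix whenever every row sums to the same total.
    v = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variance along the start vector")
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j]
                     for i in range(p) for j in range(p))
    return v, eigenvalue


sites = ["BLVA_5", "BLVA_20", "WC_6", "CP_7", "MS_15", "FG_16"]
# Hypothetical relative abundances (%) for three made-up categories.
abundance = [
    [12.0, 3.0, 5.0],
    [11.0, 4.0, 6.0],
    [4.0, 9.0, 5.5],
    [5.0, 10.0, 4.0],
    [7.0, 6.0, 9.0],
    [8.0, 5.0, 10.0],
]
centered = center_columns(abundance)
cov = covariance(centered)
pc1, eigenvalue = first_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
for site, score in zip(sites, scores):
    print(f"{site}: PC1 score = {score:+.2f}")
print(f"PC1 explains {explained:.0%} of the variance")
```

The "percent of variation" figures quoted for PC1 to PC3 correspond to the ratio computed as `explained` here: the eigenvalue of a component divided by the total variance across categories.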

Specific TIGRFAM categories responsible for differences across 
sites were also evaluated using hierarchical cluster analysis. Two 
approaches were evaluated using either a smaller set of TIGR- 
FAM categories related to "energy metabolism" (Figure 8) or all 
TIGRFAM families (Figure A8 in Appendix). In each case, com- 
munities (sites) clustered as expected based on replication of spe- 
cific variables such as sulfide/oxygen, temperature, and mat sample 
depth (Inskeep et al., 2013). The relative abundance of TIGRFAMs
associated with "energy metabolism" was evaluated and included 
genes related to sugar degradation, glycolysis/gluconeogenesis, 
pentose phosphate pathway, fermentative processes, electron 
transport, and chemolithoautotrophy (Figure 8). Site clustering 
using these TIGRFAMs confirmed greater metabolic potential for 
processes such as aerobic metabolism and oxygenic photosynthesis 
in CP_7 and WC_6, samples that contained the most cyanobacteria 
(e.g., Synechococcus, Fischerella). Conversely, the subsurface mat 
communities (FG_16 and MS_15) exhibited a greater abundance 
of genes related to the Entner-Doudoroff pathway and fermen- 
tative processes, which are expected to be more important in 
subsurface environments occurring just below the predominant 
cyanobacterial populations (see Materials and Methods). Relative
abundance within the TIGRFAM category "aerobic metabolism" 
revealed greater numbers of these genes in sites that contained sig- 
nificant levels of dissolved oxygen (i.e., no DS) compared to sulfidic 
sites (BLVA_5, 20). Moreover, TIGRFAMs associated with "anaer- 
obic metabolism" as well as "chemoautotrophy" were higher in the 
sulfidic sites (BLVA sites 5 and 20) (Figure 8), although some of 
these TIGRFAMs are also present in subsurface mat communities. 
As should be clear, specific inferences on the basis of a TIGRFAM 
assignment must be followed with further analysis of the specific 
gene or set of genes responsible for the abundance estimates within 
a category. 
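
The clustering procedure named in the Figure 8 legend (standardization by functional category, Pearson correlation as the distance measure, average linkage) can be sketched as follows. The abundance matrix, the four unnamed categories, and all function names are hypothetical; the sketch clusters sites only and is not the software used to produce Figure 8.

```python
import math
from itertools import combinations


def zscore_columns(rows):
    """Standardize each column (functional category) to mean 0, sd 1."""
    n = len(rows)
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        scaled_cols.append([(x - mean) / sd if sd > 0 else 0.0 for x in col])
    return [list(row) for row in zip(*scaled_cols)]


def pearson_distance(a, b):
    """Distance between two profiles, defined as 1 - Pearson r."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        raise ValueError("profile has zero variance")
    return 1.0 - cov / math.sqrt(va * vb)


def average_linkage(names, profiles):
    """Agglomerate profiles; return [(members_a, members_b, distance)]."""
    base = {frozenset((i, j)): pearson_distance(profiles[i], profiles[j])
            for i, j in combinations(range(len(names)), 2)}
    clusters = [[i] for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average linkage: mean distance over all cross-cluster pairs.
            pairs = [frozenset((i, j))
                     for i in clusters[a] for j in clusters[b]]
            d = sum(base[p] for p in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((sorted(names[i] for i in clusters[a]),
                       sorted(names[i] for i in clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


sites = ["BLVA_5", "BLVA_20", "WC_6", "CP_7", "MS_15", "FG_16"]
# Hypothetical relative abundances for four made-up categories.
abundance = [
    [2.0, 9.0, 1.0, 4.0],
    [2.5, 8.0, 1.5, 4.5],
    [8.0, 2.0, 6.0, 3.0],
    [9.0, 3.0, 5.0, 2.0],
    [5.0, 5.0, 2.0, 8.0],
    [4.0, 4.0, 9.0, 7.0],
]
for left, right, dist in average_linkage(sites, zscore_columns(abundance)):
    print(f"merge {left} + {right} at distance {dist:.2f}")
```

Standardizing each category first means that a category with many genes cannot dominate the correlation between two site profiles, which is the rationale given in the figure legend.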

Hierarchical cluster analysis across all TIGRFAMs grouped into 
52 functional categories showed generally similar results regard- 
ing site clustering, but the number of TIGRFAM categories used 
in the analysis precludes a full description of all protein fami- 
lies (Figure A8 in Appendix). Based on clear differences in the 
phylotypes observed in sulfidic (hypoxic) vs. oxic samples, the 
TIGRFAM abundance profiles from BLVA (sites 5 and 20), and 
those from CP_7 and WC_6 formed separate clusters as expected. 
However, relative TIGRFAM abundance profiles of the subsurface 
mat communities (FG_16 and MS_15) did not form a separate 
cluster, as these sites simply do not exhibit greater similarity to 
one another compared to similarity among all sites (e.g., organ- 
isms similar to Roseiflexus spp. are present in all sites). Despite 
similarities in physical context, the two subsurface communities 
(MS_15, FG_16) revealed different functional signatures con- 
sistent with substantial differences in community composition 
described above (Figure 2), and that are likely due to differences in 
geochemistry and temperature between the two samples (FG_16 
is ~15°C cooler than MS_15 and exhibits higher pH values, above 
pH 9). Consequently, the functional profiles across all TIGRFAM 
groupings are consistent with, and provide further support for, the 
FIGURE 7 | Principal components analysis (PCA) of relative
gene abundances (TIGRFAMs) across six phototrophic sites.
Principal components (PC1, PC2, PC3) obtained across all
TIGRFAMs grouped into functional categories (also see Figure A8 in
Appendix for hierarchical cluster analysis). Site-pairs are circled
based on separation achieved with PC1 and/or PC2 (BLVA_5 = fuchsia,
BLVA_20 = purple, WC_6 = light-blue, CP_7 = gold-brown,
MS_15 = green, FG_16 = red).



differences in community structure between MS_15 and FG_16 
(Figure A8 in Appendix). 

DISCUSSION 

The six sites investigated in this study are representative of three
general types of geothermal springs in Yellowstone National Park that
support bacterial chlorophototrophic communities and include (i)
alkaline-siliceous chloride springs (pH 7.5-9; e.g., WC_6, MS_15, and
FG_16), (ii) sulfidic-carbonate springs (pH 6-7; e.g., BLVA_5 and
BLVA_20), and (iii) mildly acidic (pH 6) non-sulfidic springs
containing high aqueous Fe(II) (e.g., CP_7) (Rowe et al., 1973;
McClesky et al., 2005). The major physical and geochemical
constraints that have been postulated to control the distribution of
phototrophs (and photosynthesis) in these thermal springs are pH,
temperature, sulfide concentration, and gradients in light and/or
other chemicals existing as a function of mat depth (Brock, 1967,
1978; Cox et al., 2011; Boyd et al., 2012). The upper temperature
limit of cyanobacterial photosynthesis is known to occur at ~74°C
(Brock, 1973), and the grazing of these microbial mats by eukaryotic
organisms typically only occurs at temperatures below 50°C. Most
springs that support bacterial chlorophototrophic mats occur at
pH > 5, with rare exceptions such as the acid-tolerant, purple
non-sulfur phototrophs related to Rhodopila sp. observed in Nymph
Lake (YNP) and in small sulfidic, acidic (pH 3.5-4.5) springs near
the Gibbon River (Pfennig, 1974; Madigan et al., 2005). The bulk
aqueous pH at CP_7 is near the lower limit observed for thermophilic
cyanobacteria (Brock, 1973), and microelectrode measurements of the
CP_7 mat revealed that it was constantly flushed by vent water with a
pH ~ 6 (Trouwborst et al., 2007). Even at pH 6, CP_7 supports an
active community of cyanobacteria that are similar to Synechococcus
sp. B'-like populations observed in Mushroom and Octopus Spring
(pH > 8) phototrophic mats (Figure A1 in Appendix).

DISTRIBUTION OF ANOXYGENIC PHOTOTROPHS

Anoxygenic chlorophototrophs are known to colonize sulfidic 
springs of YNP (van Niel and Thayer, 1930; Castenholz, 1969, 
1977; Madigan, 1984; Giovannoni et al., 1987), and this was



www.frontiersin.org 



June 2013 | Volume 4 | Article 106 | 11 



Klatt et al.



High-temperature chlorophototrophic microbial mats 



FIGURE 8 | Hierarchical cluster analysis of relative abundances of
genes in TIGRFAMs associated with "Energy Metabolism" classified by
functional category. Data were standardized by functional category
before clustering to avoid biasing analysis by a few categories with
high gene abundance. Pearson correlation was used as the distance
measure for average linkage agglomerative clustering. Category labels
recovered from the figure: Methanogenesis; Sugars; Chemoautotrophy;
Anaerobic; ATP-proton motive force; Glycolysis/gluconeogenesis;
Photosynthesis; Pentose phosphate pathway; Pyruvate dehydrogenase;
Amino acids and amines; TCA cycle; Aerobic; Fermentation;
Entner-Doudoroff; Electron transfer; Polysaccharide
biosyn./degradation. Scale bar: -1.5, 0.0, 1.5.



confirmed in samples from BLVA in which concentrations of DS exceeded
100 µM. However, the only population with genes supporting a
complete, well-studied sulfide-oxidation pathway (Dahl et al., 2005)
was the Thermochromatium-like population present in BLVA_20. The
other prominent anoxygenic chlorophototrophs included populations of
Chloroflexus and Roseiflexus-like spp. (identified across all sites).
The abundance of chlorophototrophic Chloroflexi across sites is
reflective of their previously established physiological diversity,
including photoheterotrophy with organic acids such as acetate and
propionate, photoautotrophy, photomixotrophy, and oxic and anoxic
chemoorganotrophy (Madigan et al., 1974; Pierson and Castenholz,
1974; Giovannoni et al., 1987; Hanada et al., 2002; van der Meer et
al., 2003, 2010; Zarzycki and Fuchs, 2011). While these organisms are
generally photoheterotrophic, their metabolic flexibility contributes
in part to their ability to colonize a broad spectrum of slightly
acidic to neutral pH environments at 50-70°C (Castenholz and Pierson,
1995). Highly similar (>98% average NT ID) Roseiflexus-like organisms
were abundant in all sites, independent of bulk sulfide
concentration. Moreover, Chloroflexus-like populations were found in
both sulfidic (BLVA) and oxic systems (WC_6). The presence of
Roseiflexus spp. sequences in BLVA_5 and BLVA_20 and the larger
proportion of Chloroflexus spp. in WC_6 compared to Roseiflexus spp.
were unexpected, as it has been shown that Chloroflexus spp. tolerate
higher levels of sulfide in culture (Madigan et al., 1974; Giovannoni
et al., 1987; van der Meer et al., 2010). These results suggest that
sulfide concentration is not a deterministic variable explaining
niche partitioning between Chloroflexus spp. and Roseiflexus spp.
This inconsistency with expected distribution patterns implies that
factors other than sulfide and/or oxygen are important in controlling
the relative abundance of Chloroflexus and Roseiflexus spp. in YNP
phototrophic mat environments. Finally, sequences assigned to "Ca. C.
thermophilum" (phylum Acidobacteria) (Bryant et al., 2007) were most
abundant in the oxic communities of WC_6 and MS_15 (~8 and 3% of
sequences, respectively). Although small numbers of sequences (<1%)
assigned to this organism (BLASTN, >50% NT ID) were observed in other
sites, genes encoding enzymes of (B)Chl biosynthesis and belonging to
"Ca. C. thermophilum" were only found in WC_6 and MS_15 (Table 3).

The observed differences in functional gene content between 
the two subsurface mat communities (MS_15 and FG_16) were of 
further interest, in part due to the presence of different poorly 
understood organisms in both sites. "Red-layer" communities 
(FG_16) have been shown to contain novel phototrophs (Boomer 
et al., 2000, 2002), whose pigments exhibit unusual in vivo absorp- 
tion spectra (Boomer et al., 2000). Indeed, the FG_16 sample 
contained a high G + C (~68-70%) alpha-proteobacterial pop- 
ulation not observed in any other site (Figure 2). The 16S rRNA 
sequences from FG_16 indicated the presence of an alphapro- 
teobacterium (family Hyphomicrobiaceae), some members of 
which are known to produce BChl b (Hiraishi, 1997). BChl b 
pigments were detected in solvent-based extractions from Fairy 
Geyser mat samples (M. Pagel and D. A. Bryant, unpublished data) 
and suggest that the phototrophs producing these pigments may 
exhibit light-harvesting properties that differ from those of other 
chlorophototroph populations in the mats. 

Differences in community composition between the two subsurface mat
communities may be driven by differences in temperature (60 vs.
36-40°C in MS_15 and FG_16, respectively). However, the MS_15
subsurface community was also distinct from surface (top 1-2 mm)
communities sampled from the same mats at the same temperature (Klatt
et al., 2011). For example, the abundance of Thermotoga spp. in the
subsurface communities may be driven primarily by lower oxygen levels
shown to exist 2 mm below the mat surface (Jensen et al., 2011) and
is consistent with their physiology as microaerophilic heterotrophs
(van Ooteghem et al., 2004). Anaerobic fermentation by Thermotoga
spp. could constitute a major source of H2 that could enable
photomixotrophic metabolism by Chloroflexus and Roseiflexus spp.
(Klatt et al., 2013). Moreover, compared to the phototrophic surface
layers of these mats, MS_15 subsurface communities contained fewer
Synechococcus spp., more Roseiflexus spp., and greater numbers of
likely anaerobic or fermentative organisms within the Bacteroidetes
and Thermodesulfobacteria.

TROPHIC INTERACTIONS 

Trophic interactions between FAPs and cyanobacteria have been studied
in phototrophic geothermal mats, and it has been shown that
photoheterotrophs (FAPs) utilize organic acids produced by
autotrophic cyanobacteria (Anderson et al., 1987; Nold and Ward,
1996; van der Meer et al., 2005). Moreover, it has been proposed that
Thermochromatium spp. (purple-sulfur bacteria) are primary producers
in sulfidic springs and cross-feed low-molecular weight organic acids
to FAPs (Madigan et al., 1989, 2005). This is analogous to the
cyanobacterial primary production and trophic interactions documented
to occur in Octopus Spring and Mushroom Spring (van der Meer et al.,
2005). However, this hypothesis is not supported by the relatively
heavy carbon isotope composition of Chloroflexaceae-specific lipid
biomarkers in sulfidic springs (δ¹³C = −8.9 to −18.5‰, van der Meer
et al., 2003). These isotopic compositions have been interpreted to
be too heavy to originate from compounds cross-fed from
Thermochromatium spp., which use the Calvin-Benson-Bassham cycle for
carbon dioxide fixation (δ¹³C = −20 to −35‰). The lipid signatures
are more readily explained by direct carbon dioxide fixation by
Chloroflexus and Roseiflexus spp. via the 3-HP pathway (Holo and
Sirevag, 1986; Strauss and Fuchs, 1993; van der Meer et al., 2000,
2010). Metagenome sequence assemblies obtained in the current study
showed that these uncultivated Chloroflexus and Roseiflexus spp.
contained all genes necessary for CO2 fixation via the 3-HP pathway
(Table 3), which is consistent with earlier evidence at BLVA of
short-term, sulfide-stimulated ¹⁴CO2 incorporation by FAPs
(Giovannoni et al., 1987). Collectively, these observations support
the hypothesis that all major chlorophototrophs contribute to primary
productivity in sulfidic-carbonate springs (Table 3). It remains to
be determined whether FAPs are more important contributors to primary
productivity in these systems when purple-sulfur bacteria (i.e.,
Thermochromatium) and cyanobacteria are both absent (such as observed
in BLVA_5).

This study highlights several of the major differences in community
composition and structure, and potential function of
chlorophototrophic microbial mats sampled from high-temperature
systems (40-60°C) containing high sulfide, high Fe(II), or high
dissolved oxygen. The distribution of chlorophototrophic organisms,
as would be expected, is dependent on the presence or absence of high
sulfide (cyanobacteria, purple-sulfur bacteria), and position within
laminated mats (e.g., FAPs, Bacteroidetes, and Firmicutes).
Temperature was not particularly well constrained as a consistent
parameter for comparisons across the sites included in this study.
However, the ubiquity of Chloroflexus and Roseiflexus spp. across all
sites emphasizes their ability to tolerate not only large differences
in temperature, but also extremes between high and low levels of DS
and/or oxygen. Assemblies of a novel Chlorobi population ("Ca. T.
aerophilum") from the high-iron site at Chocolate Pots (CP_7) were
similar to those obtained from Mushroom Spring and Octopus Spring
(Liu et al., 2012). These populations deserve further study,
especially considering their phylogenetic distance and different
functional attributes compared to other currently described members
of the Chlorobi. The dominant cyanobacteria observed across these
sites (found exclusively in non-sulfidic systems) included
Synechococcus spp. (CP_7, MS_15) and Fischerella (Mastigocladus) spp.
(WC_6). Consequently, sulfide is a critical geochemical variable that
selects against the presence of cyanobacteria and provides niche
opportunities for other chlorophotoautotrophs. Other poorly
represented organisms in the current study include bacteria from the
phyla Firmicutes and Bacteroidetes, and although the assemblies for
organisms within these phyla were not particularly large, a
sufficient number of genes were found to infer that their role in
these communities may involve fermentation and the degradation of
complex carbon compounds. Additional sequence assembly and/or
isolation of these populations, coupled with site-specific studies,
are necessary to clarify the important carbon cycling functions that
these populations conduct and the processes that drive interactions
among primary producers and secondary consumers in chlorophototrophic
mats.

MATERIALS AND METHODS 

SAMPLE COLLECTION AND GEOCHEMICAL ANALYSES 

Six different samples were obtained from five hot springs between
August 2007 and May 2008 (Table 1; Table S2 in Inskeep et al., 2013)
and immediately frozen in liquid N2. Phototrophic mats were sampled
at different locations relative to the source of each respective
spring, and two samples were obtained from subsurface mat layers
[Mushroom Spring (MS_15) and Fairy Geyser (FG_16)]. The subsurface
layers were obtained by careful removal of the top 2 mm green layer
with a sterile scalpel and separation of a definitive under-layer in
each mat type (e.g., Boomer et al., 2000, 2002; Nübel et al., 2002).
Geochemical characterization was performed on bulk spring water at
each sampling location after filtration (0.2 µm). Total dissolved
ions were determined using inductively coupled plasma spectrometry
and major anions determined using ion chromatography as described
previously (Macur et al., 2004; Inskeep et al., 2005). Temperature,
pH, total DS, total soluble Fe, and dissolved oxygen were determined
immediately in the field. Dissolved gases (CO2, CH4, and H2) were
determined using headspace gas chromatography of filtered field
samples (Inskeep et al., 2005).

DNA EXTRACTION AND PREPARATION 

Environmental DNA was extracted as described in Inskeep et al.
(2013). Briefly, 0.5-1 g of frozen mat samples were processed using
separate parallel DNA extractions with an enzymatic method
[Proteinase K (1 mg/ml) with Na-dodecyl sulfate (SDS) (0.3% w/v) for
0.5 h at 37°C] and a mechanical method (bead-beating with 2% w/v SDS
and 15% v/v Tris-HCl-equilibrated phenol, shaken at 5.5 m/s for 30 s)
for cell lysis. The resulting cell lysates were pooled and subsequent
DNA extractions were performed with phenol:chloroform:isoamyl alcohol
(25:24:1), and chloroform:isoamyl alcohol (24:1). This procedure
removed DNA extraction bias that has been shown to occur when only
mechanical or enzymatic protocols are used for cell lysis (Klatt et
al., 2007, 2011). All samples were treated with RNAse I (Promega,
Madison, WI, USA), and DNA was precipitated with ethanol and
Na-acetate. Small-insert (3 kb) metagenome libraries were constructed
as described in Inskeep et al. (2013). About 820 bp was sequenced at
each end of the inserts in the library clones, which produced pairs
of linked sequences (424,982 sequences) that represented a total
dataset of ~320.6 Mbp. Ribosomal (16S rRNA) gene sequence libraries
were constructed by PCR amplification using universal primers
targeting domains Archaea (4aF, TCCGGTTGATCCTGCCRG; 1391R,
GACGGGCRGTGWGTRCA) and Bacteria (27F, AGAGTTTGATCCTGGCTCAG and
1391R). Amplicons were cloned using the TOPO TA Cloning Kit
(Invitrogen, Carlsbad, CA, USA) and sequenced using Big Dye v3.1
chemistry (Applied Biosystems, Foster City, CA, USA).

PRE-ASSEMBLY METAGENOME SEQUENCE ANALYSES 

All metagenome sequences were used as queries in a "blastx" 
(Camacho et al., 2009) search against the NCBI nr database 
(accessed 22 March 2011) with default parameters. The results 
were parsed and visualized with the MEGAN software version 2.3.2 
(Huson et al., 2007) with the default parameters (MinScore = 35.0, 
TopPercent = 10.0, MinSupport = 5) and taxonomic assignments 
of the top "blastx" matches were extracted. Comparative analysis 
was also completed using several relevant reference genomes avail- 
able after this date (e.g., Fischerella sp. and "Ca. T. aerophilum"; 
Liu et al., 2012).
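
The lowest-common-ancestor style of assignment that MEGAN applies to "blastx" hits can be sketched as below, using the MinScore and TopPercent values quoted above. This is a simplified illustration, not MEGAN's implementation: the function name, the hit lists, and the abbreviated lineages are hypothetical, and the MinSupport filter (which acts across all reads) is omitted.

```python
# Abbreviated, illustrative lineages (root to leaf); not a full taxonomy.
LINEAGES = {
    "Roseiflexus": ["Bacteria", "Chloroflexi", "Chloroflexales",
                    "Roseiflexus"],
    "Chloroflexus": ["Bacteria", "Chloroflexi", "Chloroflexales",
                     "Chloroflexus"],
    "Synechococcus": ["Bacteria", "Cyanobacteria", "Synechococcus"],
}


def lca_assign(hits, lineages, min_score=35.0, top_percent=10.0):
    """Assign one read from its (taxon, bit_score) hits.

    Hits below min_score are dropped; hits scoring within top_percent of
    the best remaining hit are kept; the read is placed on the deepest
    taxon shared by all kept hits. Returns None if no hit passes.
    """
    kept = [(taxon, score) for taxon, score in hits if score >= min_score]
    if not kept:
        return None
    best = max(score for _, score in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    paths = [lineages[taxon] for taxon, score in kept if score >= cutoff]
    assigned = None
    for ranks in zip(*paths):  # walk down from the root, rank by rank
        if all(rank == ranks[0] for rank in ranks):
            assigned = ranks[0]
        else:
            break
    return assigned


# Two near-equal hits to different genera resolve to their shared order.
print(lca_assign([("Roseiflexus", 100.0), ("Chloroflexus", 95.0)], LINEAGES))
# A clearly weaker second hit falls outside the top 10% and is ignored.
print(lca_assign([("Roseiflexus", 100.0), ("Chloroflexus", 80.0)], LINEAGES))
```

Under these settings, a read is assigned at genus level only when no competing hit from another genus scores within 10% of the best hit.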

SEQUENCE ASSEMBLY AND ANNOTATION 

Metagenomic scaffolds of overlapping end sequences were constructed
separately for each of the six samples using the Celera assembler
(Miller et al., 2008; Inskeep et al., 2013). This resulted in 206,469
scaffolds containing 183.2 Mbp (27-33 Mbp per site) of assembled
sequence, or a 57% compression of the raw sequence data. The DOE-JGI
annotation pipeline was used as an initial step for inferring
functions for predicted open reading frames (ORFs) on metagenome
scaffolds, and included ORF prediction, BLAST alignments, and hidden
Markov model analysis (Mavromatis et al., 2009). Translated peptide
sequences from predicted ORFs were analyzed with the AMPHORA package
(Wu and Eisen, 2008), which identified homologs of 31 different genes
(mostly predicted to encode ribosomal proteins or enzymes with
housekeeping functions) that could be used as phylogenetic markers.
Genes encoding particular functions were identified by BLASTP using
reference sequences as queries, with the additional requirement that
candidate sequences had a top BLASTP match to a sequence with the
same annotated function in the NCBI nr database. All annotated
metagenome sequence assemblies (Celera/PGA) discussed in the current
manuscript are available through the DOE-JGI IMG/M (Markowitz et al.,
2012) website (http://img.jgi.doe.gov/m) under IMG taxon OID numbers
as follows: YNPSite06 (2022920004/2013515000), Site07
(2022920013/2014031006), Site15 (2022920016/2015219002), Site16
(2022920018/2016842003), Site05 (2022920003/2013954000), Site20
(2022920020/2016842008), and Site17 (2022920021/2016842005).
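
The assembly totals quoted above can be checked with a short calculation. The sketch assumes that "57% compression" refers to the ratio of assembled to raw sequence and takes the raw total (~320.6 Mbp) from the DNA extraction section; the function name is hypothetical.

```python
def assembly_summary(raw_mbp, assembled_mbp, n_scaffolds):
    """Return (assembled/raw ratio, mean scaffold length in bp)."""
    ratio = assembled_mbp / raw_mbp
    mean_scaffold_bp = assembled_mbp * 1e6 / n_scaffolds
    return ratio, mean_scaffold_bp


# Values as reported in the text: 320.6 Mbp raw, 183.2 Mbp assembled,
# 206,469 scaffolds.
ratio, mean_bp = assembly_summary(320.6, 183.2, 206_469)
print(f"assembled/raw = {ratio:.0%}; mean scaffold = {mean_bp:.0f} bp")
# assembled/raw = 57%; mean scaffold = 887 bp
```

On these totals the mean scaffold length is under 1 kbp, so the large scaffolds discussed elsewhere in the text represent only part of the assembled sequence.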

RIBOSOMAL RNA SEQUENCE ANALYSES 

All bacterial 16S rRNA sequences from the 16S rRNA-specific
PCR clone libraries were aligned and screened for chimeras
with Bellerophon (Huber et al., 2004) with subsequent manual
curation. OTUs were determined using the CAP3 assembler
(Huang and Madan, 1999) at the 99% demarcation level.
Rarefaction curves were determined, and the Chao1 and ACE
richness indexes and the Fisher's alpha, Shannon-Weaver, and
Simpson's diversity indexes were calculated for each library
(EcoSim version 7.0, Gotelli and Entsminger, 2001; EstimateS
v. 8.0, Colwell, 2009). The RDP Bayesian Classifier (Wang et al.,
2007) was used to assign taxonomy to 16S rRNA sequences at the
80% confidence level (Figures 5B and 6B), and all sequences
belonging to the Chloroflexi were aligned with reference
sequences corresponding to Escherichia coli positions 29-1349
(1321 positions). Alignments were masked with bacterial
complexity filters in ARB (Ludwig et al., 2004). A phylogenetic
tree was produced using the BioNJ algorithm (Gascuel, 1997)
(Figure 2) and bootstrapped with 1000 replicates. Reference
sequences shorter than the initial alignment were subsequently
added to the tree using the ARB parsimony tool. Consensus
maximum-likelihood trees were produced from 1000 replicate
trees using RAxML (Stamatakis, 2006). A maximum-likelihood
tree based upon amino acid alignments of PscD sequences was
constructed using PhyML (Guindon et al., 2010).
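
The richness and diversity indexes named above can be illustrated with a minimal sketch. It assumes the bias-corrected form of Chao1, natural-log Shannon-Weaver entropy, and Simpson's index expressed as the sum of squared proportions; the exact variants computed by EcoSim and EstimateS are not stated in the text, and the function names and example counts are hypothetical.

```python
import math

def chao1(counts):
    """Bias-corrected Chao1 richness from per-OTU clone counts (assumed variant)."""
    s_obs = sum(1 for c in counts if c > 0)   # observed OTUs
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def shannon(counts):
    """Shannon-Weaver index H' using natural logarithms."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def simpson(counts):
    """Simpson's index D as the sum of squared OTU proportions."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)

# Hypothetical clone library: number of clones assigned to each OTU.
example_counts = [10, 5, 2, 2, 1, 1, 1]
print(chao1(example_counts), shannon(example_counts), simpson(example_counts))
```

Libraries with many singletons relative to doubletons yield Chao1 estimates well above the observed OTU count, which is why richness estimators are reported alongside rarefaction curves.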

STATISTICAL ANALYSES 

A distance matrix of environmental variables was constructed by 
calculating Gower coefficients using the R statistical environment 
(R Development Core Team, 2012). The Gower coefficient allows 
for different data types (qualitative presence/absence vs. quantita- 
tive numerical) with different dimensional scales to be combined 
into a general dissimilarity metric (Gower, 1971). Geochemical 
variables were treated as factors and were correlated to this distance 
matrix using the envfit function of the vegan package (Oksanen 
et al., 2012). Metagenomic scaffolds larger than 10 kbp were
subjected to analysis using oligonucleotide composition. All
possible tri-, tetra-, penta-, and hexanucleotides were counted
with custom Perl scripts, and normalized to the length of the
scaffold. Normalized oligonucleotide composition matrices were
subjected to k-means clustering with a range of k = 4-12 with
100 trials each. Clusters were reported when at least 10
scaffolds grouped together in 90% or more of the Monte-Carlo
simulations. The composite summary of these k-means trials was
displayed as an interaction network using the program
Cytoscape 2.8.1 (Shannon et al., 2003).
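
The oligonucleotide counting and length normalization described above can be sketched as follows. This is an illustration only, not the authors' Perl scripts: it assumes overlapping windows on a single strand, skips words containing ambiguous bases, and divides each count by scaffold length; the function names are hypothetical.

```python
from itertools import product

def oligo_profile(seq, k):
    """Count overlapping k-mers in seq and normalize by scaffold length."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:          # skip words containing N or other codes
            counts[word] += 1
    return [counts[w] / len(seq) for w in kmers]

def composition_vector(seq, ks=(3, 4, 5, 6)):
    """Concatenate tri- to hexanucleotide profiles into one feature vector."""
    vec = []
    for k in ks:
        vec.extend(oligo_profile(seq, k))
    return vec

# Hypothetical scaffold; real input would be scaffolds larger than 10 kbp.
vector = composition_vector("ACGT" * 10)
print(len(vector))  # 64 + 256 + 1024 + 4096 features
```

Each scaffold becomes one row of the composition matrix, and those rows are what the repeated k-means trials would partition.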

BROAD FUNCTIONAL ANALYSIS OF METAGENOME SEQUENCES 

Assembled sequence from each of the phototrophic sites was
annotated as described in Inskeep et al. (2010) and predicted
proteins from the scaffolds were assigned TIGRFAM protein
families (Selengut et al., 2007) using HMMER 3 (Eddy, 2011)
with an E-value cutoff of 1e-6. PCA and statistical analysis of
site group differences were performed using the STAMP v2.0
software (Parks and Beiko, 2010). White's non-parametric t-test
and ANOVA were used to test for differences between two site
groups and among multiple site groups, respectively. Two-way
clustering was performed on row-standardized (across sites)
average TIGRFAM category abundance data using the Euclidean
distance metric and complete-linkage hierarchical clustering in
MeV 4.8 (Saeed et al., 2003). Other details regarding TIGRFAM
analysis are described in this issue (Inskeep et al., 2013).
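
The three ingredients of the two-way clustering step (row standardization, Euclidean distance, and the complete-linkage rule) can be illustrated with a short sketch. It assumes z-score standardization with the population standard deviation, which may differ from the MeV default; all function names and the example values are hypothetical.

```python
import math
from statistics import mean, pstdev

def standardize_row(values):
    """Z-score one functional category across sites (population SD assumed)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]   # constant row carries no signal
    return [(v - mu) / sd for v in values]

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(cluster_a, cluster_b):
    """Complete-linkage distance: the largest pairwise distance between clusters."""
    return max(euclidean(a, b) for a in cluster_a for b in cluster_b)

# Hypothetical abundances of one TIGRFAM category across three sites.
print(standardize_row([1, 2, 3]))
print(complete_linkage([[0, 0], [1, 0]], [[4, 0]]))
```

Standardizing each row first keeps a few highly abundant categories from dominating the distances between sites.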
APPENDIX



Table A1 | Community diversity estimated from 16S sequence libraries. 
Richness indexes ACE, Chao1 (w/95% confidence intervals) and diversity indexes (Fisher's alpha, Shannon-Weaver, and Simpson's Index).
40847 reads were recruited to NC_009523 Roseiflexus sp. RS-1 complete genome
19958 reads were recruited to NC_009767 Roseiflexus castenholzii DSM 13941 complete genome
27088 reads were recruited to NC_012032 Chloroflexus sp. Y-400-fl complete genome
21938 reads were recruited to NC_010175 Chloroflexus aurantiacus J-10-fl complete genome
16069 reads were recruited to NC_007776 Synechococcus sp. JA-2-3B'a(2-13) complete genome
10006 reads were recruited to NC_007775 Synechococcus sp. JA-3-3Ab complete genome



FIGURE A1 | Fragment recruitment of metagenome sequence reads from
six phototrophic mat samples to the most relevant reference genomes
(y-axis = 50-100% nucleotide (nt) identity; x-axis = genome position)
currently available in reference databases. Site colors: White Creek
(WC_6) = blue; Chocolate Pots (CP_7) = brown; Mushroom Spring
(MS_15) = green; Fairy Geyser (FG_16) = red; Bath Lake Vista Annex
(BLVA_5 and 20) = light and dark purple, respectively. Reference
genomes: Synechococcus sp. B' (JA-2-3B'a) and A (JA-3-3Ab);
Chloroflexus sp. Y-400-fl and C. aurantiacus J-10-fl; Roseiflexus sp.
RS-1 and R. castenholzii DSM 13941 (plot constructed using JCVI
bioinformatic utilities; Rusch et al., 2007).
FIGURE A2 | BLASTN recruitment of White Creek (WC_6) metagenome sequence reads to representative cyanobacterial genomes.




FIGURE A3 | Nucleotide word-frequency PCA of assembled sequence
from Chocolate Pots (CP_7). This community contains predominant
phylotypes of Roseiflexus-, Synechococcus-, "Ca. T. aerophilum"-
(Chlorobi), and Leptospirillum-like populations as well as minor
contributions from the Firmicutes, Proteobacteria, and Bacteroidetes
[green = Cyanobacteria (Synechococcus spp.); gold = Chloroflexi
(Roseiflexus spp.); maroon = Chlorobi/Thermochlorobacteriaceae/
"Ca. T. aerophilum"; dark-blue = Spirochetes/Leptospiraceae;
light-blue = Bacteroidetes/Flexibacteraceae; yellow = Firmicutes;
light-purple = Proteobacteria].
FIGURE A4 | Maximum-likelihood phylogenetic tree of Chlorobi PscD
sequences from metagenomes and sequenced genomes. The tree was
produced using the program PhyML from PscD alignments containing 172
positions, with amino acid comparisons corrected using the
Jones-Taylor-Thornton model (Jones et al., 1992). Bootstrap support
values >50% (of 1000 replicate trees) are shown at nodes.
FIGURE A5 | G + C composition of scaffold clusters. Scaffold clusters greater than 10 kbp were demarcated using oligonucleotide frequencies as depicted in
Figure 4.
FIGURE A6 | Predominant metagenome sequence assemblies
[(A) >10 kb contigs and (B) >20 kb contigs only] from Bath
Lake Vista Annex (BLVA_20) sampled during a bloom of
purple-sulfur bacteria (Gamma-proteobacteria). Major clusters
are identified at the class-level (purple = Gamma-proteobacteria;
gold = class Chloroflexi; yellow = Clostridia; light-blue = Nitrospira;
green = Acidobacteria; maroon = Betaproteobacteria;
dark-blue = Deltaproteobacteria).






[Figure A7 image: tree panels (A) and (B). Legend (spring of origin): Bath Lake Vista Annex, White Creek, Chocolate Pots, Mushroom Spring, Fairy Geyser. Collapsed clades are labeled with clone counts and sites, e.g., 106 BLVA-CP (with 1 WC), 24 FG (with 1 WC), 14 WC-BLVA, 19 WC, 3 MS. Reference taxa shown include Roseiflexus sp. RS-1, Roseiflexus castenholzii DSM 13941, Chloroflexus sp. 396-1, Chloroflexus aggregans, Chloroflexus aurantiacus J-10-fl, Chloroflexus sp. Y-400-fl, Candidatus Chlorothrix halophila, Chloronema giganteum, and Oscillochloris trichoides.]



FIGURE A7 | Neighbor-joining phylogenetic tree of Chloroflexi 16S rRNA
sequences from all clone libraries. (A) Sub-branch of tree corresponding to
Roseiflexus spp. (B) Sub-branch of tree corresponding to FAPs related to
Chloroflexus spp. and other organisms capable of producing
bacteriochlorophyll c. Sequences are color coded according to spring origin,
and numbers adjacent to polygons indicate the number of clones in each
clade. Bootstrap support values >50% (of 1000 replicate trees) are shown
at nodes.
FIGURE A8 | Hierarchical cluster analysis of relative abundances of
genes in all TIGRFAMs grouped into functional categories. Data were
standardized by functional category before clustering to avoid biasing
analysis by a few categories with high gene abundance. Pearson
correlation was used as the distance measure for average linkage
agglomerative clustering.